Parsing Formal Languages using Natural Language Parsing Techniques
Authors
Abstract
Program analysis tools used in software maintenance must be robust and ought to be accurate. Many data-driven parsing approaches developed for natural languages are robust and achieve quite high accuracy when applied to parsing software. We show this for the programming languages Java, C/C++, and Python. Further studies indicate that post-processing can almost completely remove the remaining errors. Finally, the training data for instantiating the generic data-driven parser can be generated automatically for formal languages, as opposed to the manual development of treebanks required for natural languages. Hence, our approach could improve the robustness of software maintenance tools, probably without a significant negative effect on their accuracy.
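The claim that training data can be generated automatically for formal languages can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes Python's standard `ast` and `tokenize` modules stand in for the reference parser, and the helper names `tokens_of`, `bracketed`, and `training_pair` are hypothetical. Each source snippet is turned into a pair of a token sequence and a treebank-style bracketed tree, the kind of pair a data-driven parser could be trained on.

```python
import ast
import io
import tokenize
from typing import List, Tuple

# Layout-only tokens that carry no lexical content for the parser to learn.
_SKIP = {tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
         tokenize.DEDENT, tokenize.ENDMARKER, tokenize.COMMENT}


def tokens_of(source: str) -> List[str]:
    """Return the lexical tokens of a Python snippet (hypothetical helper)."""
    reader = io.StringIO(source).readline
    return [tok.string for tok in tokenize.generate_tokens(reader)
            if tok.type not in _SKIP]


def bracketed(node: ast.AST) -> str:
    """Render an AST as a bracketed string of node type names."""
    label = type(node).__name__
    children = [bracketed(child) for child in ast.iter_child_nodes(node)]
    if children:
        return "(" + label + " " + " ".join(children) + ")"
    return "(" + label + ")"


def training_pair(source: str) -> Tuple[List[str], str]:
    """Pair the token sequence with the tree produced by the existing parser."""
    return tokens_of(source), bracketed(ast.parse(source))


if __name__ == "__main__":
    tokens, tree = training_pair("x = 1 + 2\n")
    print(tokens)
    print(tree)
```

Because the language's own parser produces the trees, no manual annotation is needed. A real setup would presumably use a tree or dependency format matching whatever the chosen data-driven parser expects.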
Similar resources
The Effect of Morphemes on Dependency Parsing of Persian
Data-driven systems can be adapted easily to different languages and domains. Applying this trend to dependency parsing led to the introduction of data-driven approaches. The only prerequisite for these approaches is the existence of appropriate corpora containing sentences and their associated dependency trees. Despite obtaining highly accurate results for the dependency parsing task in English langu...
An improved joint model: POS tagging and dependency parsing
Dependency parsing is a form of syntactic parsing of natural language that automatically analyzes the dependency structure of sentences, producing a dependency graph for each input sentence. Part-of-speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers perform POS tagging and dependency parsing in a pipeline. Unfortunately, in pipel...
Relative Clause Ambiguity Resolution in L1 and L2: Are Processing Strategies Transferred?
This study investigates whether Persian native speakers who are highly advanced in English as a second language (L2ers) can switch to optimal processing strategies in the languages they know, and whether working memory capacity (WMC) plays a role in this respect. To this end, using a self-paced reading task, we examined the processing strategies 62 Persian-speaking proficient L2ers used to read...
Multilingual Semantic Parsing: Parsing Multiple Languages into Semantic Representations
We consider multilingual semantic parsing – the task of simultaneously parsing semantically equivalent sentences from multiple different languages into their corresponding formal semantic representations. Our model is built on top of the hybrid tree semantic parsing framework, where natural language sentences and their corresponding semantics are assumed to be generated jointly from an underlyi...
Learning To Parse on Aligned Corpora
One of the first big hurdles that mathematicians encounter when considering writing formal proofs is the necessity to get acquainted with the formal terminology and the parsing mechanisms used in the large ITP libraries. This includes the large number of formal symbols, the grammar of the formal languages and the advanced mechanisms instrumenting the proof assistants to correctly understand the...